Learning Visual Models for Lip Reading

Authors

  • CHRISTOPH BREGLER
  • STEPHEN M. OMOHUNDRO
Abstract

This chapter describes learning techniques that are the basis of a "visual speech recognition" or "lipreading" system. Model-based vision systems currently have the best performance for many visual recognition tasks. For geometrically simple domains, models can sometimes be constructed by hand using CAD-like tools. Such models are difficult and expensive to construct, however, and are inadequate for more complex domains. To do model-based lipreading, we would like a parameterized model of the complex "space of lip configurations". Rather than building such a model by hand, our approach is to have the system itself build it using machine learning. The system is given a collection of training images which it uses to automatically construct the models that are later used in recognition. There are several phases of processing involved in our system. Ultimately, the recognition of the time sequence of images is performed using Hidden Markov Model technology similar to that used in speech recognition. Unlike speech recognition, however, there are extra phases to find,

Similar resources

Deep Learning for Lip Reading using Audio-Visual Information for Urdu Language

Human lip-reading is a challenging task. It requires not only knowledge of the underlying language but also visual clues to predict spoken words. Experts need a certain level of experience and an understanding of visual expressions to decode spoken words. Nowadays, with the help of deep learning, it is possible to translate lip sequences into meaningful words. The speech recognition in the noi...

A Study of Learning Style Preferences Based on the VARK Model among Medical Sciences Students

Introduction: VARK learning styles include visual, listening, reading and writing, and performance or movement styles (learning by touching, hearing, smelling, tasting and seeing). The aim of this study was to determine learning style preferences and their relationships among Medical Sciences students. Materials and Methods: This was a cross-sectional study in which 80 ...

Lip-reading from parametric lip contours for audio- visual speech recognition

This paper describes the incorporation of a visual lip tracking and lip-reading algorithm that utilizes the affine-invariant Fourier descriptors from parametric lip contours to improve the audio-visual speech recognition systems. The audio-visual speech recognition system presented here uses parallel hidden Markov models (HMMs), where a joint decision, using an optimal decision rule, is made af...

A model for the dynamics of articulatory lip movements

The present work is part of a framework to design and implement a language laboratory for speech reading/lip reading for multiple languages. It is based on the interdisciplinary project LIPPS at Technical University of Berlin, Germany, which aims to develop a training-aid for speech reading by employing a text-driven facial animation from a single passport photo with the help of 2D image morphi...

The challenge of multispeaker lip-reading

In speech recognition, the problem of speaker variability has been well studied. Common approaches to dealing with it include normalising for a speaker’s vocal tract length and learning a linear transform that moves the speaker-independent models closer to a new speaker. In pure lip-reading (no audio) the problem has been less well studied. Results are often presented that are based on speak...

Publication date: 2009